Deep Denoising Auto-encoder for Statistical Speech Synthesis
Abstract
This paper proposes a deep denoising auto-encoder technique to extract better acoustic features for speech synthesis. The technique automatically extracts low-dimensional features from high-dimensional spectral features in a non-linear, data-driven, unsupervised way. We compared the new stochastic feature extractor with conventional mel-cepstral analysis in analysis-by-synthesis and text-to-speech experiments. Our results confirm that the proposed method increases the quality of synthetic speech in both experiments.
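To make the idea concrete, here is a minimal sketch of a single denoising auto-encoder layer that compresses a high-dimensional frame into a low-dimensional code. It is not the paper's implementation: the abstract does not give the architecture, the noise model, or the training setup. The toy "spectral" frames, the Gaussian corruption, the `tanh` bottleneck, the layer sizes, and all names below are assumptions chosen for illustration. A deep variant would stack several such layers.

```python
import math
import random


def make_toy_spectra(n_frames, dim, rng):
    """Hypothetical stand-in for spectral frames: mixtures of two smooth shapes."""
    basis_a = [math.sin(math.pi * k / (dim - 1)) for k in range(dim)]
    basis_b = [math.cos(math.pi * k / (dim - 1)) for k in range(dim)]
    frames = []
    for _ in range(n_frames):
        a, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
        frames.append([a * basis_a[k] + b * basis_b[k] for k in range(dim)])
    return frames


class DenoisingAutoEncoder:
    """One-hidden-layer auto-encoder: tanh encoder, linear decoder."""

    def __init__(self, in_dim, code_dim, rng):
        self.in_dim, self.code_dim = in_dim, code_dim
        self.W_enc = [[rng.gauss(0, 0.1) for _ in range(in_dim)]
                      for _ in range(code_dim)]
        self.b_enc = [0.0] * code_dim
        self.W_dec = [[rng.gauss(0, 0.1) for _ in range(code_dim)]
                      for _ in range(in_dim)]
        self.b_dec = [0.0] * in_dim

    def encode(self, x):
        # Low-dimensional, non-linear code of the input frame.
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W_enc, self.b_enc)]

    def decode(self, h):
        return [sum(w * hj for w, hj in zip(row, h)) + b
                for row, b in zip(self.W_dec, self.b_dec)]

    def train_step(self, clean, noise_std, lr, rng):
        # Corrupt the input, but measure reconstruction error against the
        # clean frame: this is what makes the auto-encoder "denoising".
        noisy = [x + rng.gauss(0, noise_std) for x in clean]
        h = self.encode(noisy)
        recon = self.decode(h)
        err = [r - c for r, c in zip(recon, clean)]
        # Backpropagate the squared error through the decoder and the tanh.
        grad_pre = [
            sum(err[i] * self.W_dec[i][j] for i in range(self.in_dim))
            * (1.0 - h[j] ** 2)
            for j in range(self.code_dim)
        ]
        for i in range(self.in_dim):
            for j in range(self.code_dim):
                self.W_dec[i][j] -= lr * err[i] * h[j]
            self.b_dec[i] -= lr * err[i]
        for j in range(self.code_dim):
            for k in range(self.in_dim):
                self.W_enc[j][k] -= lr * grad_pre[j] * noisy[k]
            self.b_enc[j] -= lr * grad_pre[j]


def mean_clean_error(model, frames):
    """Mean squared reconstruction error on uncorrupted frames."""
    total = 0.0
    for x in frames:
        recon = model.decode(model.encode(x))
        total += sum((r - c) ** 2 for r, c in zip(recon, x)) / len(x)
    return total / len(frames)


rng = random.Random(0)
frames = make_toy_spectra(n_frames=200, dim=16, rng=rng)
model = DenoisingAutoEncoder(in_dim=16, code_dim=2, rng=rng)

error_before = mean_clean_error(model, frames)
for epoch in range(30):
    for frame in frames:
        model.train_step(frame, noise_std=0.1, lr=0.02, rng=rng)
error_after = mean_clean_error(model, frames)

code = model.encode(frames[0])  # 2-dimensional feature for one 16-dim frame
print(error_before, error_after, code)
```

In this sketch the two-dimensional `code` plays the role that the extracted low-dimensional acoustic feature plays in the abstract. A real system would train on measured spectra rather than synthetic frames.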
Similar resources
Statistical Parametric Speech Synthesis Using Bottleneck Representation From Sequence Auto-encoder
In this paper, we describe a statistical parametric speech synthesis approach with unit-level acoustic representation. In conventional deep neural network based speech synthesis, the input text features are repeated for the entire duration of phoneme for mapping text and speech parameters. This mapping is learnt at the frame-level which is the de-facto acoustic representation. However much of t...
Perception Optimized Deep Denoising AutoEncoders for Speech Enhancement
Speech Enhancement is a challenging and important area of research due to the many applications that depend on improved signal quality. It is a pre-processing step of speech processing systems and used for perceptually improving quality of speech for humans. With recent advances in Deep Neural Networks (DNN), deep Denoising Auto-Encoders have proved to be very successful for speech enhancement....
Speech enhancement with weighted denoising auto-encoder
A novel speech enhancement method with Weighted Denoising Auto-encoder (WDA) is proposed in this paper. A weighted reconstruction loss function is introduced to the conventional Denoising Auto-encoder (DA), and makes it suitable for the task of speech enhancement. First, the proposed WDA is used to model the relationship between the noisy and clean power spectrums of speech signal. Then, the es...
Pattern Recognition: Invariance Learning in Convolutional Auto Encoder Network
The ability of the human visual processing system to accommodate and retain clear understanding or identification of patterns irrespective of their orientations is quite remarkable. Conversely, pattern invariance, a common problem in intelligent recognition systems is not one that can be overemphasized; obviously, one's definition of an intelligent system broadens considering the large variabil...
Gradual training of deep denoising auto encoders
Stacked denoising auto encoders (DAEs) are well known to learn useful deep representations, which can be used to improve supervised training by initializing a deep network. We investigate a training scheme of a deep DAE, where DAE layers are gradually added and keep adapting as additional layers are added. We show that in the regime of mid-sized datasets, this gradual training provides a small ...
Journal:
- CoRR
Volume abs/1506.05268
Publication date: 2015